SCRUM-5950 index LTP variants for search#1556

Merged
oblodgett merged 1 commit into stage from SCRUM-5950 on Apr 7, 2026

Conversation

@oblodgett
Member

Summary

  • Add VariantSearchResultConverter to VariantSummaryCurationIndexer so low-throughput (LTP) variant HGVS names are indexed into search
  • Fixes missing search results for LTP variants on stage (e.g., NC_003282.8:g.4017148_4017857del, NC_003281.10:g.3000442G>A)

Test plan

  • Run indexer on stage and verify LTP variant HGVS names return search results
  • Verify existing HTP variant search results are unaffected

Add VariantSearchResultConverter to VariantSummaryCurationIndexer so
low throughput variant HGVS names are indexed and searchable.
@oblodgett oblodgett requested a review from a team as a code owner April 7, 2026 12:31
@claude

claude bot commented Apr 7, 2026

Code Review

The change is well-scoped, and the overall approach (converting VariantSummaryDocument → VariantSearchResultDocument and indexing both into the shared site index, differentiated by category) is consistent with the codebase's existing patterns.

Bug: Unsafe chained access in converter can crash the indexer process

In VariantSearchResultConverter.java:21:

CuratedVariantGenomicLocationAssociation variantLocation = doc.getVariants().get(0).getCuratedVariantGenomicLocations().get(0);
if (variantLocation == null) {
    continue;
}

If any VariantSummaryDocument has:

  • getVariants() returning null → NullPointerException
  • getVariants() returning an empty list → IndexOutOfBoundsException
  • get(0).getCuratedVariantGenomicLocations() returning null or an empty list → NullPointerException or IndexOutOfBoundsException, respectively

The null check on line 22 only guards against variantLocation itself being null, but the exception would be thrown before reaching that check. Since the catch block in startSingleThread calls System.exit(-1), a single malformed document would kill the entire indexer process mid-run.

Consider guarding the chained access, e.g.:

if (doc.getVariants() == null || doc.getVariants().isEmpty()) continue;
var variant = doc.getVariants().get(0);
if (variant.getCuratedVariantGenomicLocations() == null || variant.getCuratedVariantGenomicLocations().isEmpty()) continue;
CuratedVariantGenomicLocationAssociation variantLocation = variant.getCuratedVariantGenomicLocations().get(0);
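
To make the effect of the guard easier to check in isolation, here is a minimal, self-contained sketch. It does not use the real classes: Doc, Variant, Location, and firstLocationNames are hypothetical stand-ins that model only the two getters discussed above, and the extra null checks on the list elements go slightly beyond the suggestion and may not be needed in the actual converter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GuardedAccessSketch {

    // Hypothetical stand-in for CuratedVariantGenomicLocationAssociation.
    public static class Location {
        final String hgvs;
        public Location(String hgvs) { this.hgvs = hgvs; }
    }

    // Hypothetical stand-in for a variant; only the getter used in the review is modeled.
    public static class Variant {
        private final List<Location> locations;
        public Variant(List<Location> locations) { this.locations = locations; }
        public List<Location> getCuratedVariantGenomicLocations() { return locations; }
    }

    // Hypothetical stand-in for VariantSummaryDocument.
    public static class Doc {
        private final List<Variant> variants;
        public Doc(List<Variant> variants) { this.variants = variants; }
        public List<Variant> getVariants() { return variants; }
    }

    // Collects the HGVS name of the first location of the first variant,
    // skipping any document where that chain cannot be followed safely.
    public static List<String> firstLocationNames(List<Doc> docs) {
        List<String> names = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.getVariants() == null || doc.getVariants().isEmpty()) continue;
            Variant variant = doc.getVariants().get(0);
            if (variant == null) continue; // extra guard, not part of the suggestion above
            List<Location> locations = variant.getCuratedVariantGenomicLocations();
            if (locations == null || locations.isEmpty()) continue;
            Location location = locations.get(0);
            if (location == null) continue; // mirrors the original null check
            names.add(location.hgvs);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc(null),                                                    // null variants
            new Doc(Collections.emptyList()),                                 // empty variants
            new Doc(Collections.singletonList(new Variant(null))),            // null locations
            new Doc(Collections.singletonList(
                new Variant(Collections.emptyList()))),                       // empty locations
            new Doc(Collections.singletonList(new Variant(
                Collections.singletonList(new Location("NC_003281.10:g.3000442G>A"))))));
        // Only the last, well-formed document contributes a name.
        System.out.println(firstLocationNames(docs));
    }
}
```

In this sketch the four malformed documents are skipped without throwing, so the loop reaches the well-formed one. Without the guards, the first document alone would raise a NullPointerException.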

Everything else looks correct — the two document types use different primary keys so there's no overwrite risk in the shared index, and the converter logic for extracting genes, consequences, and cross-references is sound.

@oblodgett oblodgett merged commit a112c2b into stage Apr 7, 2026
5 checks passed
@oblodgett oblodgett deleted the SCRUM-5950 branch April 7, 2026 12:58